Question 1 [6 Marks]


Answer each of the following multiple choice questions by clearly writing the question
number and part in your answer book followed by the letter corresponding to your answer.
(For example: Q1 (i) A.)


(i) A birth is selected at random. Define events B = {the baby is a boy} and
F = {the mother had the flu during her pregnancy}. The events B and F are

A. mutually exclusive but not independent.
B. independent but not mutually exclusive.
C. mutually exclusive and independent.
D. neither mutually exclusive nor independent. [1 Mark]

(ii) Which one of these X variables is a discrete random variable?

A. An experiment in chemistry is repeated many times and X is the time required
for a reaction to occur in seconds.
B. A student is randomly selected and X is the number of correct answers on a six
question multiple-choice quiz.
C. An Australia Post package is randomly selected and X is the weight in pounds of
the package.
D. A student is randomly selected and X is the distance they must travel in metres
to go from their college room door to the door of their first class on Monday morning.
[1 Mark]

(iii) Which one of the following ways of collecting data would not result in paired data?

A. Each person is measured twice.
B. Similar individuals are paired prior to an experiment. Each individual in a pair
receives a different treatment.
C. Two different variables are measured for each person.
D. Two independent samples are selected and the same response variable is compared
between samples. [1 Mark]

(iv) Which of the following statements is correct about a parameter and a statistic
associated with repeated random samples of the same size from the same population?

A. Values of a parameter will vary from sample to sample but values of a statistic
will not.
B. Values of both a parameter and a statistic may vary from sample to sample.
C. Values of a parameter will vary according to the sampling distribution for that
parameter.
D. Values of a statistic will vary according to the sampling distribution for that
statistic. [1 Mark]

(v) Which of the following relationships could be analyzed using a chi-square test?

A. The relationship between height (inches) and weight (pounds).
B. The relationship between satisfaction with K-12 schools (satisfied or not) and
political party affiliation.
C. The relationship between gender and amount willing to spend on a stereo system
(in dollars).
D. The relationship between opinion on gun control and income earned last year (in
thousands of dollars). [1 Mark]

(vi) Which one of the following choices describes a problem for which an analysis of
variance would be appropriate?

A. Comparing the proportion of successes for three different treatments of anxiety.
Each treatment is tried on 100 patients.
B. Analyzing the relationship between high school GPA and college GPA.
C. Comparing the mean birth weights of newborn babies for three different racial
groups.
D. Analyzing the relationship between gender and opinion about capital punishment
(favor or oppose). [1 Mark]

Question 2 [6 Marks]


Three fair coins are tossed. The probabilities that 0, 1, 2 or 3 heads occur are given in
Table 1 below.


Table 1: Probabilities for the outcome of tossing 3 coins 
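
The body of Table 1 is not reproduced here. As a minimal sketch, assuming the three tosses are independent and each coin is fair, the distribution of the number of heads can be recovered by enumerating the eight equally likely outcomes. The function name `head_count_distribution` is illustrative only and is not part of the exam.

```python
from fractions import Fraction
from itertools import product


def head_count_distribution(n_coins=3):
    """Return {number of heads: probability} for n fair, independent coins."""
    # Every sequence of H/T is equally likely for fair coins.
    outcomes = list(product("HT", repeat=n_coins))
    counts = {}
    for outcome in outcomes:
        heads = outcome.count("H")
        counts[heads] = counts.get(heads, 0) + 1
    return {k: Fraction(v, len(outcomes)) for k, v in sorted(counts.items())}


table_1 = head_count_distribution()
for heads, prob in table_1.items():
    print(heads, prob)
```

Exact fractions are used so that event probabilities can later be added without rounding error.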


Let A and B be the events: 


A = {at least one head occurs} 


B = {an odd number of heads occur} 


(a) Find P(A), P(B), P(A and B) and P(A or B). [2 Marks]
(b) Find P(A|B) and P(B|A). [2 Marks] 
(c) Are the events A and B independent? Justify your answer. [2 Marks] 

Question 3 [8 Marks]


The distribution of haemoglobin in g/dl of blood is approximately N(14,1) in women
and N(16,1) in men. Answer the following questions, showing all calculations and using
correct probability notation.


(a) What is the probability that a woman, chosen at random, has more haemoglobin
than the mean level for men? [2 Marks]

(b) A sample of 16 women was selected at random and the mean haemoglobin level was
calculated.

(i) What is the sampling distribution of the mean haemoglobin level for women for
samples of size 16? [2 Marks]

(ii) What is the probability that a sample of sixteen women, chosen at random, has
a mean level greater than the mean level for men? [2 Marks]

(c) Explain the difference in results from (a) and (b) (ii). You may sketch a diagram to
illustrate your response. [2 Marks]


Cumulative Probability 

Question 4 [4 Marks]


A survey of 504 randomly selected US teenagers (aged 15-17) found that 57% of 248 
boys and 70% of 256 girls had online profiles. 


(a) Construct a 95% confidence interval for the difference in proportions of boys and girls
having online profiles. You may use

$$\sqrt{\frac{0.57(1 - 0.57)}{248} + \frac{0.7(1 - 0.7)}{256}} = 0.042$$

[2 Marks]
(b) With reference to the 95% CI write an informative conclusion. [2 Marks] 
Question 5 [12 Marks] 


Many Vietnam war veterans are concerned that their health may have been affected 
by exposure to Agent Orange, a herbicide whose most worrisome component is the highly 
toxic compound dioxin. A study compared the dioxin concentration (in parts per trillion) 
in blood samples from 646 US army veterans who served in areas heavily treated with 
Agent Orange and 97 US army veterans who served outside of Vietnam. Neither sample
was randomly selected.


Figure 1: Dioxin Concentration 
Table 2:
          mean     sd  data:n
OTHER    4.186  2.302      97
VIETNAM  4.260  2.643     646


Two Sample t-test 


data: DIOXIN by VETERAN 


t = -0.263, df = 741, p-value = 0.396 


alternative hypothesis: true difference in means is less than 0 
95 percent confidence interval: 
-Inf 0.392 
sample estimates: 
mean in group OTHER mean in group VIETNAM 
4.186 4.260 


(a) Is this an observational study or a controlled experiment? [1 Mark]

(b) Explain why a pooled estimate of the standard deviation is appropriate for this data.
[1 Mark]

(c) Showing all calculations, confirm that the value of the test statistic is approximately
-0.263. [4 Marks]

(d) Are the conditions for the validity of the t-test satisfied with regard to skewness and
outliers? Explain your answer. [2 Marks]

(e) Why is a one-sided t-test appropriate for this study? [1 Mark]

(f) Write an informative conclusion. [3 Marks]

Question 6 [6 Marks]


Edwin Hubble (Proc. Nat. Acad. Sci. 15(1929), 168-73) observed the distance and
recession velocity of 24 extragalactic nebulae. A scatterplot of the data is shown in Figure 2
and a regression analysis in Table 3. (1 megaparsec = 3.09 x 10^19 km).


Figure 2: Distance vs Velocity for 24 extragalactic nebulae 
Table 3: Regression Analysis


Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  0.399098    0.118470     3.37    0.0028
VELOCITY     0.001373    0.000227     6.04   4.5e-06


Residual standard error: 0.405 on 22 degrees of freedom
Multiple R-squared: 0.624, Adjusted R-squared: 0.606
F-statistic: 36.4 on 1 and 22 DF, p-value: 4.48e-06




STAT100 Trimester 3, 2013 


(a) Write the equation of the least squares line that describes the relationship between
recession velocity and distance. [1 Mark]


(b) According to cosmological theory the age of the universe (time elapsed since the Big
Bang) is given by the slope of the regression line. Showing your calculations, confirm
that the 95% confidence interval for the slope is (0.00090, 0.00184). (This corresponds
to approximately (0.88, 1.80) billion years.) [2 Marks]


(c) Showing your calculations, obtain a 99% confidence interval for the y-intercept. The 
Big Bang theory predicts that the y-intercept is zero. Do Hubble’s observations 
support this? [3 Marks] 


[You may take t* = 2.07 for the 95% CI and t* = 2.82 for the 99% CI.]
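
The interval quoted in part (b) has the usual form estimate ± t* × standard error. The following is a minimal sketch of that arithmetic, using the VELOCITY estimate and standard error from Table 3 and the t* = 2.07 supplied above. The helper name `t_interval` is illustrative and not part of the exam.

```python
def t_interval(estimate, std_error, t_star):
    """Return (lower, upper) for estimate +/- t_star * std_error."""
    margin = t_star * std_error
    return estimate - margin, estimate + margin


# Slope (VELOCITY) row of Table 3 with the 95% multiplier given in the question.
low, high = t_interval(0.001373, 0.000227, 2.07)
print(f"({low:.5f}, {high:.5f})")
```

The same helper applies to any coefficient row once the appropriate t* multiplier is chosen.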


Question 7 [6 Marks]


Eight trees were grown on each of five types of root-stock in an apple orchard grafting
experiment. The extension growth (in metres) was measured after four years for each of
the trees. A dotplot of the data is shown in Figure 3 and summary statistics in Table 4.


Figure 3: Root-stock and Growth 


Table 4: Summary Statistics
    mean     sd  data:n
X1  2.98  0.448
X2  3.11  0.552
X3  2.82  0.393
X4  2.88  0.512
X5  2.56  0.607


(a) Part of the Anova table is shown below: 


> summary(apples.anova)

Df Sum Sq Mean Sq F value Pr(>F) 
root.stock 4 1.36 — 0.28 
Residuals 35 9.03 


Calculate the value of the missing F-statistic. [2 Marks]

(b) Test for a difference in mean extension growth among the five root-stocks. [2 Marks]

(c) State the necessary conditions for the test. With reference to relevant output, check
as many of those assumptions as possible. [2 Marks]


Question 8 [6 Marks] 


Two traits widely studied in tomato plants are height, tall versus dwarf, and leaf type, 
cut versus potato. Tall and cut are dominant. When dihybrids are crossed the phenotypes 
tall-cut, tall-potato, dwarf-cut and dwarf-potato should appear in a 9:3:3:1 ratio. In
an experiment N = 1611 progeny of dihybrids were categorized by phenotype. 


Table 5: Observed Phenotypes
tall-cut  tall-potato  dwarf-cut  dwarf-potato
     926          288        293           104


(a) State the null hypothesis. [1 Mark]

(b) Write down the expected count for each of the four phenotypes. [1 Mark]

(c) The test statistic is χ² = 1.47. Referring to Figure 4, find the p-value for the test
statistic. [1 Mark]

(d) Using correct probability notation, express the p-value in part (c) as a conditional
probability. [2 Marks]

(e) Test the null hypothesis. [1 Mark]


Figure 4: Chi-squared distribution function
Please remember- This examination question paper MUST BE HANDED IN. 


Failure to do so may result in the cancellation of all marks for this examination. 


Writing your name and number on the front will help us confirm that your 
Formulae


P(A|B) = P(A) if A and B are independent 


P(A and B) = P(A)P(B) if A and B are independent 
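
As an illustration of the product rule above, the sketch below checks whether P(A and B) = P(A)P(B) holds for events on a finite, equally likely sample space. The fair-die example and the helper names `prob` and `are_independent` are assumptions made for illustration and do not come from the exam.

```python
from fractions import Fraction


def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def are_independent(a, b, sample_space):
    """Check the product rule P(A and B) == P(A) * P(B) exactly."""
    return prob(a & b, sample_space) == prob(a, sample_space) * prob(b, sample_space)


# Hypothetical example: one roll of a fair six-sided die.
die = set(range(1, 7))
even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_most_three = {1, 2, 3}

print(are_independent(even, at_most_four, die))   # 1/3 == 1/2 * 2/3
print(are_independent(even, at_most_three, die))  # 1/6 != 1/2 * 1/2
```

Exact fractions avoid the floating-point comparisons that could make an equality test on probabilities unreliable.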